Period for Reply

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) OR THIRTY (30) DAYS, WHICHEVER IS LONGER, FROM THE MAILING DATE OF THIS COMMUNICATION.

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed
after SIX (6) MONTHS from the mailing date of this communication.

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 

- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).
Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any 
earned patent term adjustment. See 37 CFR 1.704(b). 
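
The reply periods stated above reduce to simple date arithmetic. The following is a minimal sketch only, assuming a hypothetical mailing date (the mailing date is not stated in this portion of the action) and ignoring the rules for periods ending on a weekend or federal holiday. The helper names `add_months` and `reply_deadlines` are illustrative and are not part of any Office system.

```python
import calendar
from datetime import date, timedelta


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's last day."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def reply_deadlines(mailing_date: date) -> dict:
    """Compute the shortened period (3 months or 30 days, whichever is longer)
    and the six-month statutory maximum, both measured from the mailing date."""
    shortened = max(add_months(mailing_date, 3), mailing_date + timedelta(days=30))
    return {
        "shortened_period_ends": shortened,
        "statutory_maximum": add_months(mailing_date, 6),
    }


# Hypothetical mailing date, chosen only for illustration.
example = reply_deadlines(date(2007, 12, 5))
```

A reply filed after `shortened_period_ends` but on or before `statutory_maximum` would need an extension of time under 37 CFR 1.136(a), and, per the notice above, may reduce any earned patent term adjustment.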

Status 

1) ☒ Responsive to communication(s) filed on 24 September 2007.
2a) ☒ This action is FINAL. 2b) □ This action is non-final.

3) □ Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4) ☒ Claim(s) 1, 3-5, 7-9, 11-14, 16, 17, 20-33, 35, 68-71 and 73-80 is/are pending in the application.

4a) Of the above claim(s) is/are withdrawn from consideration. 

5) □ Claim(s) is/are allowed.

6) ☒ Claim(s) 1, 3-5, 7-9, 11-14, 16, 17, 20-33, 35, 68-71 and 73-80 is/are rejected.

7) □ Claim(s) is/are objected to.

8) □ Claim(s) are subject to restriction and/or election requirement.

Application Papers 

9) □ The specification is objected to by the Examiner.

10) □ The drawing(s) filed on is/are: a) □ accepted or b) □ objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a). 

Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d). 
11) □ The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12) □ Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a) □ All b) □ Some * c) □ None of:

1. □ Certified copies of the priority documents have been received.

2. □ Certified copies of the priority documents have been received in Application No. .

3. □ Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received. 
DETAILED ACTION

Application Background

1. This action is responsive to the amendment filed on 9/24/2007.

2. Applicant has canceled claims 2, 6, 10, 18 and 19, amended claims 1, 14, 16,
17, 20, 31, 33, 68, 71 and 74, and added new claims 76-80. Claims 15, 34, 36-67
and 72 were previously canceled.

3. Claims 1, 3-5, 7-9, 11-14, 16, 17, 20-33, 35, 68-71 and 73-80 are pending in the
case. Claims 1, 31, 68 and 74 are independent claims.

4. Acknowledgement is made to the applicant's submission of an Information
Disclosure Statement filed 11/14/2007. The foreign patent document listed therein 
has not been considered because the foreign patent document is written in a 
language other than English, and applicant has failed to provide a concise 
explanation of the relevance, as it is presently understood by applicant, or a written 
translation. See MPEP 609 and 37 CFR 1.98. 

5. The examiner's rejection of claims 2, 6, 10, 18 and 19, made under 35 USC 102, 
as described in the Claims Rejections - 35 USC 102 section of the previous office 
action (dated 3/22/2007) is withdrawn in view of the canceled claims. 

6. The examiner's rejection of claims 1, 3-5, 7-9, 11-14, 16, 17, 20-33, 35, 68-71, 73
and 74, made under 35 USC 102, as described in the Claims Rejections - 35 USC
102 section of the previous office action (dated 3/22/2007) is withdrawn in view of
the amended claims.

Claim Rejections - 35 USC § 112 

7. The following is a quotation of the first paragraph of 35 U.S.C. 112: 

"The specification shall contain a written description of the invention, and of the
manner and process of making and using it, in such full, clear, concise, and exact
terms as to enable any person skilled in the art to which it pertains, or with which
it is most nearly connected, to make and use the same and shall set forth the
best mode contemplated by the inventor of carrying out his invention."

8. Claims 76-80 are rejected under 35 U.S.C. 112, first paragraph, as failing to 
comply with the written description requirement. The claims contain subject matter,
which was not described in the specification in such a way as to reasonably convey 
to one skilled in the relevant art that the inventor(s), at the time the application was 
filed, had possession of the claimed invention. 

9. Regarding claim 76, the amendment filed 9/24/2007 adds the following 
limitations: "visual blocks are specified" (first limitation) and "separators are
specified" (second limitation). The examiner has reviewed the originally filed
specification, and has failed to find support for the added limitations. Applicant is 
required to cancel the new matter in response to this office action. 

10. Regarding claim 77, the claim is rejected for fully incorporating the
deficiencies of claim 76, and the amendment filed 9/24/2007 adds the following
limitation: "the separator specification comprises a specification of a display area".
The examiner has reviewed the originally filed specification, and has failed to find
support for the added limitation. Applicant is required to cancel the new matter in
response to this office action.

11. Regarding claim 78, the claim is rejected for fully incorporating the
deficiencies of claims 76 and 77, and the amendment filed 9/24/2007 adds the
following limitation: "the specification of the display area comprises a specification of
a start pixel and a specification of an end pixel". The examiner has reviewed the
originally filed specification, and has failed to find support for the added limitations. 
Applicant is required to cancel the new matter in response to this office action. 

12. Regarding claims 79 and 80, the amendment filed 9/24/2007 adds the following
limitation: "initializing a specification of an initial separator". The examiner has
reviewed the originally filed specification, and has failed to find support for the added 
limitations. Applicant is required to cancel the new matter in response to this office 
action. 

13. The following is a quotation of the second paragraph of 35 U.S.C. 112:

"The specification shall conclude with one or more claims particularly pointing out 
and distinctly claiming the subject matter which the applicant regards as his 
invention." 

14. Claims 68-71 and 73-75 are rejected under 35 U.S.C. 112, second paragraph, as 
being incomplete for omitting essential elements, such omission amounting to a gap 
between the elements. See MPEP § 2172.01. The omitted elements are the system 
components prescribed in the preamble of independent claims 68 and 74. The 
limitations of independent claims 68 and 74 are directed to logical components
embodied on computer readable media (i.e. a visual block extractor, a visual
separator detector, a content structure constructor (claim 68) or means for
identifying, means for detecting, and means for constructing (claim 74)). However,
these logical components fail to include the basic components necessary to describe
a computer implemented system for vision based document segmentation.
Dependent claims 69-71 and 73-74 fail to remedy the deficiencies of the base
claims.

Claim Rejections - 35 USC § 103 

15. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed 
or described as set forth in section 102 of this title, if the differences between the 
subject matter sought to be patented and the prior art are such that the subject 
matter as a whole would have been obvious at the time the invention was made 
to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was 
made. 

16. Claims 1, 3-5, 7-9, 11-14, 16, 17, 20-33, 35, 68-71 and 73-80 are rejected under 
35 U.S.C. 102(b) as being anticipated by Yang et al. "HTML Page Analysis Based 
on Visual Cues" from the 6th International Conference on Document Analysis and
Recognition (ICDAR 2001), Seattle, Washington, USA, Copyright 2001 (hereinafter
Yang) in view of Ma et al., US Patent Publication 2004/0013302, filed 11/13/2002,
published 1/22/2004 (hereinafter Ma).

17. Regarding independent claim 1, Yang discloses identifying a plurality of visual 
blocks in a document and detecting, distinct from the plurality of visual blocks, one or 
more separators of the document based on, at least, one or more characteristics of 
at least one of the plurality of visual blocks. Yang recites: "records in one category 
are normally organized in ways having a consistent visual layout style. Boundaries 
between different categories are marked apparently with different visual styles or
separators. As we have said, the basic idea of our approach is to detect these visual 
cues" (page 2, left column, third paragraph). 

Yang discloses constructing a content structure for the document. Yang recites: 
"in section 3 we introduce our heuristics. After that, we talk about our method to 
detect visual patterns and then to construct document structures based on these 
heuristics" (page 2, left column, second paragraph). Yang discloses the content 
structure identifying, for the different visual blocks, different portions of semantic 
content. Yang recites: "In this paper, we propose a novel method to extract semantic 
structures from general HTML pages. This method doesn't require a priori
knowledge of web pages. It uses features derived directly from layout of HTML
pages" (page 1, last paragraph to page 2, first paragraph).

Yang's identifying step uses visual cues, and fails to disclose identifying based 
on a document model of the document. Ma teaches the use of a document model to 
identify blocks of a document. Ma recites: "The segmentation and, optionally, OCR 
results 18 are matched to one or more document models in the classification and
labeling process performed by matching module 20" (paragraph 21). Therefore, it
would have been obvious, to one of ordinary skill in the art, at the time the invention
was made to identify document segments by using a document model in order "to
generate an identified, segmented document" (Ma, paragraph 9).

18. Regarding dependent claim 3, Yang discloses a document described by a tree 
structure having a plurality of nodes. Yang's process is directed toward HTML 
documents. HTML documents are inherently processed by computers in a
well-known process commonly referred to as parsing. Yang discloses parsing. Yang
recites: "the process to parse HTML documents and extract simple objects" (page 2,
right column, last paragraph). Parsing is a process where elements of a markup
language document are placed into a tree structure in a relative way (i.e. there is a
first or root element, with subsequent elements being related to the first as child, and
where child elements can further have children). These elements are commonly
referred to as nodes. Yang discloses identifying a group of candidate nodes, and for
each node in the group: determining whether the node can be divided, and if the
node cannot be divided, identifying the node as a visual block. Yang recites: "During
the process to parse HTML documents and to extract simple objects" (page 2, right
column, last paragraph), where Yang describes the simple object as
"None-breakable visual HTML objects" (page 2, left column, last paragraph).
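
The parsing and node-division steps discussed in this paragraph can be illustrated with a short sketch. It is an illustration only, assuming Python's standard `html.parser` module and a deliberately crude test for "cannot be divided" (a node with no child elements). It is neither Yang's method nor the claimed method, and the names `Node`, `TreeBuilder` and `leaf_blocks` are hypothetical.

```python
from html.parser import HTMLParser


class Node:
    """One element of the parse tree."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Builds a simple element tree from an HTML string."""

    VOID = {"br", "hr", "img", "input", "meta", "link"}  # elements with no end tag

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in self.VOID:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this tag, then step out of it.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.text += data.strip()


def leaf_blocks(node):
    """Return the nodes that cannot be divided further (no child elements)."""
    if not node.children:
        return [node]
    blocks = []
    for child in node.children:
        blocks.extend(leaf_blocks(child))
    return blocks


builder = TreeBuilder()
builder.feed("<html><body><h1>Title</h1><div><p>First</p><p>Second</p></div></body></html>")
blocks = leaf_blocks(builder.root)
block_tags = [b.tag for b in blocks]
block_texts = [b.text for b in blocks]
```

In this sketch the `div` node is divided into its two `p` children, while `h1` and each `p` are treated as candidate visual blocks. A fuller treatment would also weigh visual properties such as color and size, as paragraph 20 of this action describes.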
19. Regarding dependent claim 4, Yang discloses setting a degree of coherence
for the visual block. The specification defines the degree of coherence as "a
measure of how coherent the visual block is" (page 15, lines 13-14). Yang recites: "A
modifier equals to zero means that two objects are distinct or can't be compared"
(page 2, right column, last paragraph).

20. Regarding claims 5, 7-9, and 11-13, Yang discloses dividing nodes into their 
respective child nodes based on criteria related to tags and node properties 
(including colors and sizes) on page 2, the bottom of the left column to the bottom of 
the right column. 

21. Regarding claims 14, 16, 17, and 20-30, Yang discloses detecting the one or 
more separators. Yang recites: "Boundaries between different categories are
marked apparently with different visual styles or separators. As we have said, the 
basic idea of our approach is to detect these visual cues" (page 2, left column, third 
paragraph). See also Figure 3a on page 6, where multiple visual blocks are 
identified where the separators between the blocks are both horizontal and vertical 
in nature. 

22. Regarding independent claims 31, 68 and 74, the claims are directed toward a 
computer-readable media or a system, for the method of claim 1, and are rejected 
using the same rationale. 
23. Regarding claims 32, 70 and 75, the claims are directed toward a computer
readable media or a system, for the method of claim 3, and are rejected using the 
same rationale. 

24. Regarding claims 33 and 71, the claims are directed toward a computer
readable media or a system, for the method of claim 14, and are rejected using the 
same rationale. 

25. Regarding claims 35 and 73, the claims are directed toward a computer 
readable media or a system, for the method of claims 4 and 26 combined, and are 
rejected using the same rationale. 

26. Regarding claim 69, Yang discloses a document retrieval module that retrieves 
documents from a plurality of documents based at least in part on the content 
structure constructed for one or more of the plurality of documents. Yang discloses 
the "World Wide Web" with documents encoded as "markup languages like HTML
intending for visual browsers" (page 1, first paragraph of the Introduction section). A
browser is well known for retrieving documents (web pages) from a plurality of 
documents (the World Wide Web), based on the content structure (the markup 
language). 

27. Regarding claims 76-78, Yang discloses visual analysis of documents where the
boundaries and other document objects are visual cues to the document semantics.
Ma teaches the use of document models in the visual analysis of documents, as
described above. Ma discloses a document model where the visual blocks are
specified in Figure 2, at reference sign 26A. Yang discloses a specification for a
display area in Figure 3A. The specification for the display area would inherently 
include a specification to a start and an end pixel. 

28. Regarding claims 79-80, Yang discloses initializing a specification of an initial 
separator to include a display area that would be occupied by the entire document if 
it were displayed and initializing a specification of an initial separator to include a 
display area that would contain each of the plurality of visual blocks if they were 
displayed in Figure 3a and 3b, where Figure 3a shows the entire document being 
displayed in the display area, and Figure 3b shows each of the visual blocks 
displayed in the display area. 
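
For readers unfamiliar with the claim language, the idea of initializing a separator to span a whole display area and then narrowing it around each visual block can be sketched in one dimension. This is a hedged illustration only. It is not taken from Yang, from Ma, or from the applicant's specification, and the function name `detect_separators` and the pixel ranges are hypothetical.

```python
def detect_separators(area, blocks):
    """Return the gaps left in `area` after removing every block.

    `area` and each block are (start_pixel, end_pixel) pairs along one axis,
    with the end pixel exclusive.
    """
    separators = [area]  # initial separator spans the entire display area
    for b_start, b_end in blocks:
        remaining = []
        for s_start, s_end in separators:
            if b_end <= s_start or b_start >= s_end:
                remaining.append((s_start, s_end))  # block does not touch this separator
                continue
            if s_start < b_start:
                remaining.append((s_start, b_start))  # part of the separator before the block
            if b_end < s_end:
                remaining.append((b_end, s_end))  # part of the separator after the block
        separators = remaining
    return separators


# Hypothetical layout: a 100-pixel-tall page holding three blocks.
separators = detect_separators((0, 100), [(10, 30), (40, 70), (70, 90)])
```

Each resulting pair is a start pixel and an end pixel of a blank band between or around blocks. Running the same routine on the horizontal and vertical axes would yield vertical and horizontal separators respectively.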

Response to Arguments 

29. Applicant's arguments, filed 9/24/2007, with respect to claim 1 have been 
considered but are moot in view of the new ground(s) of rejection, as described 
above. 

30. Furthermore, applicant argues that "the nature and use of separators as required 
by claim 1 are not described in Yang" (page 19, first paragraph of the response
dated 9/24/2007). Yang is directed toward analyzing HTML pages based on visual 
cues to detect a semantic structure of the HTML document (abstract). Yang's 
analysis uses the visual similarity of HTML objects, and provides figures 2 and 3 as 
examples of typical web pages (first paragraph of Section 2, on page 2). Yang's 
analysis includes the fundamental step of detecting boundaries between objects,
where the boundaries between objects are "marked apparently with different visual 
styles or separators" (first paragraph of Section 2, on page 2). Yang's approach is a 
bottom up approach where the simplest visual objects are considered before 
complex visual objects (section 2 on pages 2 and 3). Yang detects the separation of 
visual blocks based on the characteristics of the visual blocks, by looking at the 
parameters of the visual blocks including such attributes as text font face, style, size 
and color (first paragraph of section 2.1). 

Conclusion 

31. Applicant's amendment necessitated the new ground(s) of rejection presented in
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP 
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37 
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire
THREE MONTHS from the mailing date of this action. In the event a first reply is
filed within TWO MONTHS of the mailing date of this final action and the advisory
action is not mailed until after the end of the THREE-MONTH shortened statutory
period, then the shortened statutory period will expire on the date the advisory action
is mailed, and any extension fee pursuant to 37 CFR 1.136(a) will be calculated from
the mailing date of the advisory action. In no event, however, will the statutory 
period for reply expire later than SIX MONTHS from the date of this final action. 
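
The interaction between an early first reply and a late advisory action, as described in the preceding paragraph, can be expressed as a small function. This is a sketch under stated assumptions: the dates are hypothetical, weekend and holiday adjustments are ignored, and the names `add_months` and `shortened_period_expiry` are illustrative only.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the same day-of-month `months` later, clamped to the month's last day."""
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def shortened_period_expiry(mailing, first_reply=None, advisory_mailed=None):
    """Expiry of the shortened statutory period for a final action."""
    three_months = add_months(mailing, 3)
    six_months = add_months(mailing, 6)
    replied_within_two = first_reply is not None and first_reply <= add_months(mailing, 2)
    if replied_within_two and advisory_mailed is not None and advisory_mailed > three_months:
        # Period runs to the advisory action's mailing date, capped at six months.
        return min(advisory_mailed, six_months)
    return three_months


mailing = date(2007, 12, 5)  # hypothetical mailing date
default_expiry = shortened_period_expiry(mailing)
late_advisory_expiry = shortened_period_expiry(mailing, date(2008, 1, 20), date(2008, 3, 20))
capped_expiry = shortened_period_expiry(mailing, date(2008, 1, 20), date(2008, 7, 1))
```

In the second case any extension fee under 37 CFR 1.136(a) would be measured from the advisory action's mailing date rather than from the three-month date.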



Application/Control Number: 10/628,766
Art Unit: 2178

32. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to Gregory J. Vaughn whose telephone number is (571) 
272-4131. The examiner can normally be reached Monday to Friday from 8:00 am to 
5:00 pm. 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Stephen S. Hong, can be reached at (571) 272-4124. The fax phone
number for the organization where this application or proceeding is assigned is (571) 
272-2100. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR 
only. For more information about the PAIR system, see http://pair-direct.uspto.gov. 
Should you have questions on access to the Private PAIR system, contact the 
Electronic Business Center (EBC) at 866-217-9197 (toll-free). 




Gregory J. Vaughn 
Patent Examiner 
November 30, 2007 



STEPHEN HONG 
SUPERVISORY PATENT EXAMINER 



